I. Basis of the report

1. With regard to the elements of the international application:*

□ the international application as originally filed

□ the description:
    pages          , as originally filed
    pages          , filed with the demand
    pages          , filed with the letter of

□ the claims:
    pages          , as originally filed
    pages          , as amended (together with any statement) under Article 19
    pages          , filed with the demand
    pages          , filed with the letter of

□ the drawings:
    pages          , as originally filed
    pages          , filed with the demand
    pages          , filed with the letter of

□ the sequence listing part of the description:
    pages          , as originally filed
    pages          , filed with the demand
    pages          , filed with the letter of

2. With regard to the language, all the elements marked above were available or furnished to this Authority in the language in which
the international application was filed, unless otherwise indicated under this item.

These elements were available or furnished to this Authority in the following language, which is:

□ the language of a translation furnished for the purposes of international search (under Rule 23.1(b)).
□ the language of publication of the international application (under Rule 48.3(b)).
□ the language of the translation furnished for the purposes of international preliminary examination (under Rules 55.2 and/or 55.3).

3. With regard to any nucleotide and/or amino acid sequence disclosed in the international application, the international
preliminary examination was carried out on the basis of the sequence listing:

□ contained in the international application in written form.

□ filed together with the international application in computer readable form.

□ furnished subsequently to this Authority in written form.

□ furnished subsequently to this Authority in computer readable form.

□ The statement that the subsequently furnished written sequence listing does not go beyond the disclosure in the
  international application as filed has been furnished.

□ The statement that the information recorded in computer readable form is identical to the written sequence listing has
  been furnished.

4. □ The amendments have resulted in the cancellation of:

□ the description, pages
□ the claims, Nos.
□ the drawings, sheets/fig.

5. □ This report has been established as if (some of) the amendments had not been made, since they have been considered to go
beyond the disclosure as filed, as indicated in the Supplemental Box (Rule 70.2(c)).**

* Replacement sheets which have been furnished to the receiving Office in response to an invitation under Article 14 are referred to
in this report as "originally filed" and are not annexed to this report since they do not contain amendments (Rules 70.16
and 70.17).

** Any replacement sheet containing such amendments must be referred to under item 1 and annexed to this report.
International application No.
PCT/SE00/00482

V. Reasoned statement under Article 35(2) with regard to novelty, inventive step or industrial applicability;
citations and explanations supporting such statement



1. Statement

Novelty (N)                      Claims 1-15    YES
                                 Claims         NO

Inventive step (IS)              Claims 1-15    YES
                                 Claims         NO

Industrial applicability (IA)    Claims 1-15    YES
                                 Claims         NO



2. Citations and explanations (Rule 70.7)

Prior art

The prior art, cited in the search report, consists of the following
documents:

D1: US 5 517 641, A

D2: US 5 842 209, A

D3: US 5 423 035, A



D1 describes a data table re-organisation method for a computer
storage system. D2 describes a method for visually depicting join
relationships in a database system. D3 describes a method for
performing relational database qualifications on a database with
table structure. However, none of the documents describes a method
for identifying common variables in different tables and connecting
the variables in a structure, as claimed. Therefore, they merely
define the state of the art.



Statement of reasons 

The documents D1-D3, or any combination of them, do not describe
such a method for extracting information, as claimed in claims 1-14,
or such an article, as claimed in claim 15. There is also no
teaching in the cited art leading a skilled person to this method or
article. Therefore, the claimed invention is novel and involves an
inventive step.

Accordingly, claims 1-15 are novel (N) and fulfil the requirements
of inventive step (IS) and industrial applicability (IA).
A method operates on a database to extract and present
information to a user. The database comprises data tables containing
values of a number of variables. The information is to be extracted
by evaluating at least one mathematical function which operates on
one or more selected calculation variables. The presented information
is to be partitioned on one or more selected classification variables.
The method comprises the steps of identifying all boundary tables;
identifying all connecting tables; electing a starting table among
said boundary and connecting tables; building a conversion structure
that links values of each selected variable in the boundary tables
to corresponding values of one or more connecting variables in the
starting table; and evaluating the mathematical function for each data
record of the starting table, by using the conversion structure, such
that the evaluation yields a final data structure containing a result of
the mathematical function for every unique value of each classification
variable.
METHOD FOR EXTRACTING INFORMATION FROM A DATABASE



Technical field 

The present invention relates to a method for
extracting information from a database. The database
comprises a number of data tables containing values of a
number of variables, each data table consisting of at
least one data record including at least two of said
values. The information is extracted by evaluation of at
least one mathematical function, which operates on one or
more selected calculation variables. Further, the extracted
information is partitioned on one or more selected
classification variables.

Background of the invention

It is often desired to extract specific information
from a database that is stored on a secondary memory of a
computer. More specifically, there is a need to summarise a
large amount of data in the database, and present the
summarised data to a user in a lucid way. For example, a
user might be interested in extracting total sales per
year and client from a database including transaction
data for a large company. Thus, the extraction involves
evaluation of a mathematical function, e.g. a summation
("SUM(x*y)"), operating on a combination of calculation
variables (x, y), e.g. the number of sold items
("Number") and the price per item ("Price"). The extraction
also involves partitioning the information according
to classification variables, e.g. "Year" and "Client".
Thus, the classification variables define how the result
of the mathematical operation should be presented. In
this specific case, the extraction of the total sales per
year by client would involve evaluation of
"SUM(Number*Price) per Year, Client".

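As a minimal sketch of what evaluating "SUM(Number*Price) per Year, Client" amounts to, the following Python fragment groups records by the classification variables and accumulates the expression for each group. The rows and the helper name `sum_per` are hypothetical and chosen only for illustration; the rows are assumed to be already joined into one flat list, which is not how the method described later operates.

```python
from collections import defaultdict

# Hypothetical, already-joined transaction rows; the values are invented.
rows = [
    {"Year": 1999, "Client": "A", "Number": 2, "Price": 6.5},
    {"Year": 1999, "Client": "A", "Number": 3, "Price": 2.0},
    {"Year": 1999, "Client": "B", "Number": 1, "Price": 6.5},
    {"Year": 2000, "Client": "A", "Number": 4, "Price": 2.5},
]


def sum_per(rows, classification, expression):
    """Sum expression(row) for each unique combination of classification values."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[name] for name in classification)
        totals[key] += expression(row)
    return dict(totals)


# SUM(Number*Price) per Year, Client
total_sales = sum_per(rows, ("Year", "Client"),
                      lambda r: r["Number"] * r["Price"])
```

Each key of `total_sales` is one combination of classification values, and each value is the accumulated result of the calculation expression for that combination.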
In one prior-art solution, a computer program is
designed to process the database and to evaluate all
conceivable mathematical functions operating on all
conceivable calculation variables partitioned on all
conceivable classification variables, also called
dimensions. The result of this operation is a large data
structure commonly known as a multidimensional cube. This
multidimensional cube is obtained through a very
time-consuming operation, which typically is performed
overnight. The cube contains the evaluated results of the
mathematical functions for every unique combination of
the occurring values of the classification variables. The
user can then, in a different computer program operating
on the multidimensional cube, explore the data of the
database, for example by visualising selected data in
pivot tables or graphically in 2D and 3D charts. When the
user defines a mathematical function and one or more
classification variables, all other classification
variables are eliminated through a summation over the
results stored in the cube for this mathematical function,
the summation being made for all other classification
variables. Thus, by adding or removing classification
variables, the user will move up or down in the
dimensions of the cube.

This approach has some undesired limitations. If the
multidimensional cube after evaluation contains average
quantities, e.g. the average sales partitioned on a
number of classification variables, the user cannot
eliminate one or more of these classification variables,
since a summation over average quantities does not yield
a correct total average. In this case, the
multidimensional cube must contain the average quantity
split on every conceivable combination of classification
variables as well, adding an extra complexity to the
operation of building the multidimensional cube. The same
problem arises for other quantities, e.g. median values.
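A small worked example, using invented numbers, illustrates why pre-computed averages cannot be combined after the fact. Client "A" has three sales and client "B" has one; combining the two per-client averages gives a different figure from the average over all four sales, whereas keeping a sum and a count per cell allows correct recombination. The variable names are assumptions made for this sketch.

```python
from statistics import mean

# Hypothetical sales amounts per client.
sales = {"A": [10.0, 20.0, 30.0], "B": [100.0]}

# Averages as they would be stored per client in a pre-computed cube.
per_client_average = {client: mean(values) for client, values in sales.items()}

# Eliminating the "Client" dimension by combining stored averages ...
combined_from_averages = mean(list(per_client_average.values()))

# ... differs from the average over the underlying records.
true_average = mean([x for values in sales.values() for x in values])

# Keeping (sum, count) per cell allows a correct recombination.
per_client_sum_count = {c: (sum(v), len(v)) for c, v in sales.items()}
total, count = map(sum, zip(*per_client_sum_count.values()))
recombined_average = total / count
```

Here the per-client averages are 20 and 100, so combining them yields 60, while the average over all four records is 40.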

Often it is difficult to predict all relevant
mathematical functions, calculation variables and
classification variables before making a first
examination of the data in the database. Upon identifying
trends and patterns, the user might find a need to add a
function or a variable to reach underlying details in the
data. Then, the time-consuming procedure of building a new
multidimensional cube must be initiated.

Summary of the invention

Accordingly, the object of the present invention is
to mitigate the above-mentioned drawbacks and to provide
a method for extracting information from a database,
which method allows the user to freely select mathematical
functions and incorporate calculation variables in
these functions as well as to freely select classification
variables for presentation of the results.

This object is achieved by a method having the
features recited in independent claim 1. Preferred
embodiments are recited in the dependent claims.

According to the present invention there is provided
a method for generating a final data structure, i.e. a
multidimensional cube, from data in a database in an
efficient way, with respect to both process time and
memory requirement. Since the cube can be generated much
faster than in prior-art solutions, it is possible to
generate multidimensional cubes ad hoc. The user can
interactively define and generate a cube without being
limited to a very small number of mathematical functions
and variables. The mathematical function is normally
composed of a combination of mathematical expressions. If
the user needs to modify the mathematical function by
changing, adding or removing a mathematical expression, a
new cube can normally be generated in a time short enough
not to disturb the user in his work. Similarly, if the
user desires to add or remove a variable, the cube can be
rapidly regenerated.

This is achieved by a clever grouping of all relevant
data tables into boundary tables and connecting
tables, respectively, based on the type of variables
included in each table. By electing one of these tables
as a starting table and by building an appropriate
conversion structure, the final data structure can be
efficiently generated from the starting table by use of
the conversion structure.

Preferably, the data records of the database are
first read into the primary memory of a computer so that
the data can be processed off-line. This will further
reduce the time for searching the database and generating
the final data structure. The database may be stored on a
secondary memory or be a remotely stored database to
which the computer is connected by a modem. It is to be
understood that the database thus read into the primary
memory may be a selected part of a larger database or a
combination of two or more databases.
In one preferred embodiment, each different value of
each data variable is assigned a binary code and the data
records are stored in binary-coded form. On account of
the binary coding, very rapid searches can be conducted
in the data tables. Moreover, redundant information can
be removed, resulting in a reduced amount of data.

In another preferred embodiment, all boundary and
connecting tables that include calculation variables with
a need for frequency data, i.e. variables for which the
number of duplicates of each value is necessary for
correct evaluation of the mathematical function, define a
subset. By electing the starting table from this subset
and by including frequency data in the conversion
structure, memory-efficient storage of duplicates can be
achieved when building the final data structure.

In the conversion structure, the frequency data can
be included by duplication of each value, i.e. the
conversion structure will contain a link from each value of
a connecting variable in the starting table to a correct
number of each value of at least one corresponding
selected variable in a boundary table. Alternatively, a
counter may be included in the conversion structure for
each unique value of each connecting variable in the
starting table.

Preferably, the boundary or connecting table having
the largest number of data records is elected as starting
table. This tends to minimise the amount of frequency
data that must be incorporated in the conversion
structure, which therefore can be built more rapidly.

In a further preferred embodiment, a virtual data
record is created by reading a data record of the
starting table and by using the conversion structure to
convert each value of each connecting variable in this
data record into a value of at least one corresponding
selected variable. Thereby, the virtual data record will
contain a current combination of values of the selected
variables. The final data structure can be gradually
built by sequentially reading data records from the
starting table, by updating the content of the virtual
data record based on the content of each such data
record, and by evaluating the mathematical function based
on the content of each such updated virtual data record.
This procedure minimises the amount of computer memory
that is needed for extracting the requested information
from the database. Further, virtual data records containing
undefined values, so-called NULL values, of any
calculation variable can often be immediately removed, in
particular when all calculation variables exhibit NULL
values, since in many cases such values cannot be used
in the evaluation of the mathematical function. This
feature will contribute to an optimised performance.

In another preferred embodiment, an intermediate
data structure is built based on the content of the
virtual data record. Each data record of the intermediate
data structure contains a field for each selected
classification variable and an aggregation field for each
mathematical expression included in the mathematical
function. For each updated virtual data record, each
mathematical expression is evaluated and the result is
aggregated in the appropriate aggregation field based on
the current value of each selected classification
variable. Such an intermediate data structure allows the
user to combine mathematical expressions with different
need for frequency data in one mathematical function. By
building several conversion structures incorporating
corresponding frequency data, and by evaluating the data
records of a starting table for each such mathematical
expression based on a corresponding conversion structure,
it is possible to merge the results of these evaluations
in one intermediate data structure. Likewise, if the user
modifies the mathematical function by adding a new
mathematical expression operating on the already selected
calculation variables, it is only necessary to add an
aggregation field to the existing intermediate data
structure, or to extend an existing aggregation field.

It should be noted that the virtual data record in
general is indeed virtual, i.e. it is not physically
allocated any memory, during the transition from a data
record of the starting table to the final data structure.
However, such a virtual data record can always, at least
implicitly, be identified in the procedure of converting
the content of a data record of the starting table into
current values of the selected variables.



Description of preferred embodiments

The present invention will now be described by way
of examples, reference being made to the tables of
Appendix A and to Figs 1-2 of the drawings, Fig. 1
showing the content of a database after identification of
relevant data tables according to the inventive method,
and Fig. 2 showing a sequence of steps of an embodiment
of the method according to the invention.



A database, as shown in Fig. 1, comprises a number
of data tables (Tables 1-5). Each data table contains
data values of a number of data variables. For example,
in Table 1 each data record contains data values of the
data variables "Product", "Price" and "Part". If there is
no specific value in a field of the data record, this
field is considered to hold a NULL value. Similarly, in
Table 2 each data record contains values of the variables
"Date", "Client", "Product" and "Number". Typically, the
data values are stored in the form of ASCII-coded
strings.

The method according to the present invention is
implemented by means of a computer program. In a first
step (step 101), the program reads all data records in
the database, for instance using a SELECT statement which
selects all the tables of the database, i.e. Tables 1-5
in this case. Typically, the database is read into the
primary memory of the computer.



To increase the evaluation speed, it is preferred
that each unique value of each data variable in said
database is assigned a different binary code and that the
data records are stored in binary-coded form (step 101).
This is typically done when the program first reads the
data records from the database. For each input table, the
following steps are carried out. First the column names,
i.e. the variables, of the table are successively read.
Every time a new data variable appears, a data structure
is instantiated for it. Then, an internal table structure
is instantiated to contain all the data records in binary
form, whereupon the data records are successively read
and binary-coded. For each data value, the data structure
of the corresponding data variable is checked to
establish if the value has previously been assigned a binary
code. If so, that binary code is inserted in the proper
place in the above-mentioned table structure. If not, the
data value is added to the data structure and assigned a
new binary code, preferably the next one in ascending
order, before being inserted in the table structure. In
other words, for each data variable, a unique binary code
is assigned to each unique data value.
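The coding step described above can be sketched roughly as follows. This is an illustrative Python fragment, not the program of the invention: the function name `encode_table`, the use of dictionaries as the per-variable data structures, and the use of `None` for a missing value are assumptions. The code -2 for NULL follows the convention the description uses in its worked example.

```python
NULL_CODE = -2  # the worked example in the description represents NULL by -2


def encode_table(columns, records, codebooks):
    """Return the records with every value replaced by its integer code.

    codebooks maps each variable name to a dict {value: code}; it is shared
    between tables so that a variable gets the same codes everywhere.
    """
    coded = []
    for record in records:
        coded_record = []
        for name, value in zip(columns, record):
            book = codebooks.setdefault(name, {})  # new variable: new structure
            if value is None:
                coded_record.append(NULL_CODE)
                continue
            if value not in book:
                book[value] = len(book)  # next code in ascending order
            coded_record.append(book[value])
        coded.append(coded_record)
    return coded


# Hypothetical input table with columns "Client" and "Product".
codebooks = {}
coded = encode_table(
    ["Client", "Product"],
    [("A", "Soap"), ("B", "Soap"), ("A", None)],
    codebooks,
)
```

Because the codebooks are keyed by variable name, a second table containing "Product" would reuse the codes already assigned when the first table was read.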
Tables 6-12 of Appendix A show the binary codes
assigned to different data values of some data variables
included in the database of Fig. 1.

After having read all data records of the database,
the program analyses the database to identify all
connections between the data tables (step 102). A connection
between two data tables means that these data tables have
one variable in common. Different algorithms for
performing such an analysis are known in the art. After the
analysis all data tables are virtually connected. In Fig.
1, such virtual connections are illustrated by
double-ended arrows (a). The virtually connected data tables
should form at least one so-called snowflake structure,
i.e. a branching data structure in which there is one and
only one connecting path between any two data tables in
the database. Thus, a snowflake structure does not
contain any loops. If loops do occur among the virtually
connected data tables, e.g. if two tables have more than
one variable in common, a snowflake structure can in some
cases still be formed by means of special algorithms
known in the art for resolving such loops.
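One simple way to picture this analysis is sketched below in Python. It is a naive illustration under stated assumptions, not one of the algorithms referred to in the text: tables are linked pairwise whenever they share a variable, and a union-find pass reports whether the resulting links are loop-free. The column list for Table 3 ("Date", "Year") is inferred from later passages, and "Table X" is a hypothetical table added only to produce a loop.

```python
from itertools import combinations


def find_connections(tables):
    """Return {(t1, t2): shared variables} for every pair sharing a variable."""
    links = {}
    for (a, vars_a), (b, vars_b) in combinations(tables.items(), 2):
        shared = set(vars_a) & set(vars_b)
        if shared:
            links[(a, b)] = shared
    return links


def is_snowflake(tables, links):
    """True if the links contain no loops, i.e. every component is a tree."""
    if any(len(shared) > 1 for shared in links.values()):
        return False  # two tables with several common variables form a loop
    parent = {t: t for t in tables}

    def root(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving
            t = parent[t]
        return t

    for a, b in links:
        ra, rb = root(a), root(b)
        if ra == rb:
            return False  # a second path between a and b already exists
        parent[ra] = rb
    return True


tables = {
    "Table 1": ["Product", "Price", "Part"],
    "Table 2": ["Date", "Client", "Product", "Number"],
    "Table 3": ["Date", "Year"],  # assumed column list
}
links = find_connections(tables)
loop_free = is_snowflake(tables, links)

# Hypothetical extra table that closes a loop via "Part" and "Year".
with_loop = dict(tables, **{"Table X": ["Part", "Year"]})
has_loop = not is_snowflake(with_loop, find_connections(with_loop))
```

Note that this pairwise model would also report a loop when three tables share the same single variable; handling such cases is left to the loop-resolving algorithms the text mentions.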

After this initial analysis, the user can start to
explore the database. In doing so, the user defines a
mathematical function, which could be a combination of
mathematical expressions (step 103). Assume that the user
wants to extract the total sales per year and client from
the database in Fig. 1. The user defines a corresponding
mathematical function "SUM(x*y)", and selects the calculation
variables to be included in this function: "Price"
and "Number". The user also selects the classification
variables: "Client" and "Year".

The computer program then identifies all relevant
data tables (step 104), i.e. all data tables containing
any one of the selected calculation and classification
variables, such data tables being denoted boundary
tables, as well as all intermediate data tables in the
connecting path(s) between these boundary tables in the
snowflake structure, such data tables being denoted
connecting tables. For the sake of clarity, the group of
relevant data tables (Tables 1-3) is included in a first
frame (A) in Fig. 1. Evidently, there are no connecting
tables in this particular case.
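The identification of boundary and connecting tables can be sketched as follows, assuming the snowflake structure is available as an adjacency mapping. The helper names `path` and `relevant_tables`, the adjacency dictionary and the column list for Table 3 are assumptions made for this illustration. With the four variables selected in the text, all of Tables 1-3 come out as boundary tables; the second call uses a different, hypothetical selection to show a case where Table 2 acts as a connecting table.

```python
from collections import deque


def path(adjacency, start, goal):
    """Breadth-first search; returns the list of tables from start to goal."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            break
        for neighbour in adjacency.get(node, ()):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    if goal not in previous:
        return []
    result, node = [], goal
    while node is not None:
        result.append(node)
        node = previous[node]
    return result[::-1]


def relevant_tables(tables, adjacency, selected):
    """Split the relevant tables into boundary tables and connecting tables."""
    boundary = [t for t, cols in tables.items() if set(cols) & set(selected)]
    connecting = set()
    for i, a in enumerate(boundary):
        for b in boundary[i + 1:]:
            connecting.update(t for t in path(adjacency, a, b)
                              if t not in boundary)
    return boundary, sorted(connecting)


tables = {
    "Table 1": ["Product", "Price", "Part"],
    "Table 2": ["Date", "Client", "Product", "Number"],
    "Table 3": ["Date", "Year"],  # assumed column list
}
adjacency = {
    "Table 1": ["Table 2"],
    "Table 2": ["Table 1", "Table 3"],
    "Table 3": ["Table 2"],
}

# Selection used in the text: no connecting tables.
case_text = relevant_tables(tables, adjacency,
                            {"Price", "Number", "Client", "Year"})
# Hypothetical selection: Table 2 only connects Table 1 and Table 3.
case_other = relevant_tables(tables, adjacency, {"Price", "Year"})
```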

In the present case, all occurrences of every value,
i.e. frequency data, of the selected calculation
variables must be included for evaluation of the
mathematical function. In Fig. 1, the selected variables
("Price", "Number") requiring such frequency data are
indicated by bold arrows (b), whereas remaining selected
variables are indicated by dotted lines (b'). Now, a
subset (B) can be defined that includes all boundary
tables (Tables 1-2) containing such calculation variables
and any connecting tables between such boundary tables in
the snowflake structure. It should be noted that the
frequency requirement of a particular variable is
determined by the mathematical expression in which it is
included. Determination of an average or a median calls
for frequency information. In general, the same is true
for determination of a sum, whereas determination of a
maximum or a minimum does not require frequency data of
the calculation variables. It can also be noted that
classification variables in general do not require
frequency data.
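The distinction can be checked on a small invented sample: if dropping duplicate values changes the result of an expression, that expression needs frequency data. The helper `depends_on_duplicates` is a hypothetical name, and the outcome is of course only demonstrated for this particular sample.

```python
from statistics import mean, median


def depends_on_duplicates(function, values):
    """True if removing duplicate values changes the result for this sample."""
    return function(values) != function(sorted(set(values)))


prices = [6.5, 6.5, 4.0]  # hypothetical sample with one duplicated value

needs_frequency = {
    "sum": depends_on_duplicates(sum, prices),
    "mean": depends_on_duplicates(mean, prices),
    "median": depends_on_duplicates(median, prices),
    "max": depends_on_duplicates(max, prices),
    "min": depends_on_duplicates(min, prices),
}
```

For this sample the sum, mean and median change when the duplicate is dropped, while the maximum and minimum do not.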

Then, a starting table is elected, preferably among
the data tables within subset (B), most preferably the
data table with the largest number of data records in
this subset (step 105). In Fig. 1, Table 2 is elected as
the starting table. Thus, the starting table contains
selected variables ("Client", "Number"), and connecting
variables ("Date", "Product"). These connecting variables
link the starting table (Table 2) to the boundary tables
(Tables 1 and 3).

Thereafter, a conversion structure is built (step
106), as shown in Tables 13 and 14. This conversion
structure is used for translating each value of each
connecting variable ("Date", "Product") in the starting
table (Table 2) into a value of a corresponding selected
variable ("Year", "Price") in the boundary tables (Table
3 and 1, respectively). Table 13 is built by successively
reading data records of Table 3 and creating a link
between each unique value of the connecting variable
("Date") and a corresponding value of the selected
variable ("Year"). It can be noted that there is no link
from value 4 ("Date: 1999-01-12"), since this value is
not included in the boundary table. Similarly, Table 14
is built by successively reading data records of Table 1
and creating a link between each unique value of the
connecting variable ("Product") and a corresponding value
of the selected variable ("Price"). In this case, value 2
("Product: Toothpaste") is linked to two values of the
selected variable ("Price: 6.5"), since this connection
occurs twice in the boundary table. Thus, frequency data
is included in the conversion structure. Also note that
there is no link from value 3 ("Product: Shampoo").
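A conversion structure of this kind can be sketched as a mapping from each value of the connecting variable to a list of linked values of the selected variable, with duplicates kept so that frequency data is preserved. The function name `build_conversion` and the records below are hypothetical; only the doubled "Toothpaste" link at price 6.5 and the missing "Shampoo" link mirror what the text states about Table 14. The second variant stores a counter per value, in the spirit of the alternative mentioned in the summary.

```python
from collections import Counter, defaultdict


def build_conversion(boundary_records, connecting, selected):
    """Link each connecting value to the selected values, duplicates kept."""
    links = defaultdict(list)
    for record in boundary_records:
        links[record[connecting]].append(record[selected])
    return dict(links)


# Hypothetical boundary-table records (only Product and Price shown).
table_1 = [
    {"Product": "Toothpaste", "Price": 6.5},
    {"Product": "Soap", "Price": 4.0},
    {"Product": "Toothpaste", "Price": 6.5},
]

price_links = build_conversion(table_1, "Product", "Price")

# Alternative representation: one counter per connecting value.
price_counts = {product: Counter(prices)
                for product, prices in price_links.items()}

# A value absent from the boundary table simply has no link.
shampoo_links = price_links.get("Shampoo", [])
```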

When the conversion structure has been built, a
virtual data record is created. Such a virtual data
record, as shown in Table 15, accommodates all selected
variables ("Client", "Year", "Price", "Number") in the
database. In building the virtual data record (step
107-108), a data record is first read from the starting table
(Table 2). Then, the value of each selected variable
("Client", "Number") in the current data record of the
starting table is incorporated in the virtual data
record. Also, by using the conversion structure (Tables
13-14) each value of each connecting variable ("Date",
"Product") in the current data record of the starting
table is converted into a value of a corresponding
selected variable ("Year", "Price"), this value also
being incorporated in the virtual data record.

At this stage (step 109), the virtual data record is
used to build an intermediate data structure (Table 16).
Each data record of the intermediate data structure
accommodates each selected classification variable
(dimension) and an aggregation field for each
mathematical expression implied by the mathematical function. The
intermediate data structure (Table 16) is built based on
the values of the selected variables in the virtual data
record. Thus, each mathematical expression is evaluated
based on one or more values of one or more relevant
calculation variables in the virtual data record, and the
result is aggregated in the appropriate aggregation field
based on the combination of current values of the
classification variables ("Client", "Year").

The above procedure is repeated for all data records
of the starting table (step 110). Thus, an intermediate
data structure is built by successively reading data
records of the starting table, by incorporating the
current values of the selected variables in a virtual
data record, and by evaluating each mathematical
expression based on the content of the virtual data
record. If the current combination of values of
classification variables in the virtual data record is new, a new
data record is created in the intermediate data structure
to hold the result of the evaluation. Otherwise, the
appropriate data record is rapidly found, and the result
of the evaluation is aggregated in the aggregation field.
Thus, data records are added to the intermediate data
structure as the starting table is traversed. Preferably,
the intermediate data structure is a data table
associated with an efficient index system, such as an AVL or a
hash structure. In most cases, the aggregation field is
implemented as a summation register, in which the result
of the evaluated mathematical expression is accumulated.
In some cases, e.g. when evaluating a median, the
aggregation field is instead implemented to hold all
individual results for a unique combination of values of
the specified classification variables. It should be
noted that only one virtual data record is needed in the
procedure of building the intermediate data structure
from the starting table. Thus, the content of the virtual
data record is updated for each data record of the
starting table. This will minimise the memory requirement
in executing the computer program.
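The traversal described above can be sketched in a few lines of Python. This is a simplified illustration under several assumptions: the function name `build_intermediate`, the dictionary used as a hash-indexed intermediate structure, the use of `None` for NULL and all data values are hypothetical and do not reproduce Appendix A. Links for classification variables are assumed to hold at most one value per connecting value, and the aggregation field is a plain summation register.

```python
from itertools import product


def build_intermediate(starting_rows, own_selected, conversions,
                       classification, calculation, expression):
    """Traverse the starting table once and aggregate per classification key.

    conversions maps a connecting variable to (selected variable, links),
    where links maps each connecting value to a list of selected values.
    """
    intermediate = {}  # hash-indexed: classification values -> summation register
    virtual = {}       # the single virtual data record, updated in place
    for row in starting_rows:
        virtual.update({name: row[name] for name in own_selected})
        targets = [selected for selected, _ in conversions.values()]
        choices = [links.get(row[connecting], [None])  # no link gives NULL
                   for connecting, (_, links) in conversions.items()]
        for combination in product(*choices):  # one pass per duplicate link
            virtual.update(zip(targets, combination))
            if any(virtual[name] is None for name in calculation):
                continue  # NULL in a calculation variable: drop the record
            key = tuple(virtual[name] for name in classification)
            intermediate[key] = intermediate.get(key, 0.0) + expression(virtual)
    return intermediate


# Hypothetical starting table and conversion structures.
table_2 = [
    {"Date": "1999-01-02", "Client": "A", "Product": "Toothpaste", "Number": 2},
    {"Date": "1999-01-07", "Client": "B", "Product": "Soap", "Number": 3},
    {"Date": "1999-01-12", "Client": "C", "Product": "Soap", "Number": 1},
    {"Date": "1999-01-02", "Client": "A", "Product": "Shampoo", "Number": 5},
]
year_links = {"1999-01-02": [1999], "1999-01-07": [1999]}
price_links = {"Toothpaste": [6.5, 6.5], "Soap": [4.0]}

cube = build_intermediate(
    table_2,
    own_selected=("Client", "Number"),
    conversions={"Date": ("Year", year_links), "Product": ("Price", price_links)},
    classification=("Client", "Year"),
    calculation=("Number", "Price"),
    expression=lambda v: v["Number"] * v["Price"],
)
```

In this sketch the duplicated "Toothpaste" link makes the first record count twice, the record with an unlinked date is kept under a NULL year, and the "Shampoo" record is dropped because its price is NULL. A median would need the aggregation field to hold a list of results instead of a running sum.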

The procedure of building the intermediate data
structure will be further described with reference to
Tables 15-16. In creating the first virtual data record
R1, as shown in Table 15, the values of the selected
variables "Client" and "Number" are directly taken from
the first data record of the starting table (Table 2).
Then, the value "1999-01-02" of the connecting variable
"Date" is transferred into the value "1999" of the
selected variable "Year", by means of the conversion
structure (Table 13). Similarly, the value "Toothpaste"
of the connecting variable "Product" is transferred into
the value "6.5" of the selected variable "Price" by means
of the conversion structure (Table 14), thereby forming
the virtual data record R1. Then, a data record is
created in the intermediate data structure, as shown in
Table 16. In this case, the intermediate data structure
has three columns, two of which hold selected
classification variables ("Client", "Year"). The third column holds
an aggregation field, in which the evaluated result of
the mathematical expression ("x*y") operating on the
selected calculation variables ("Number", "Price") is
aggregated. In evaluating virtual data record R1, the
current values (binary codes: 0,0) of the classification
variables are first read and incorporated in this data
record of the intermediate data structure. Then, the
current values (binary codes: 2,0) of the calculation
variables are read. The mathematical expression is
evaluated for these values and added to the associated
aggregation field.

Next, the virtual data record is updated based on
the starting table. Since the conversion structure (Table
14) indicates a duplicate of the value "6.5" of the
selected variable "Price" for the value "Toothpaste" of
the connecting variable "Product", the updated virtual
data record R2 is unchanged and identical to R1. Then,
the virtual data record R2 is evaluated as described
above. In this case, the intermediate data structure
contains a data record corresponding to the current values
(binary codes: 0,0) of the classification variables.
Thus, the evaluated result of the mathematical expression
is accumulated in the associated aggregation field.

Next, the virtual data record is updated based on
the second data record of the starting table. In evaluating
this updated virtual data record R3, a new data record is
created in the intermediate data structure, and so on.

It should be noted that NULL values are represented
by a binary code of -2 in this example. In the illustrated
example, it should also be noted that any virtual
data records holding a NULL value (-2) of any one of the
calculation variables can be directly eliminated, since
NULL values cannot be evaluated in the mathematical
expression ("x*y"). It should also be noted that all NULL
values (-2) of the classification variables are treated
as any other valid value and are placed in the
intermediate data structure.
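
The NULL rule can be sketched in a few lines of Python. This is only
an illustration under stated assumptions, not the implementation of
the method: the function name aggregate, the tuple layout of a
virtual data record and the three sample records are hypothetical.
Only the code -2 for NULL and the code tables for "Number" and
"Price" are taken from the example.

```python
NULL = -2  # binary code representing NULL in this example

# Code-to-value lookups following the binary coding of "Number" and "Price".
NUMBER = {0: 3, 1: 5, 2: 8, 3: 2, 4: 10}
PRICE = {0: 7.5, 1: 9.35, 2: 6.5}


def aggregate(virtual_records):
    """Sum Number*Price per combination of classification codes.

    Each record is (classification_codes, (number_code, price_code));
    this layout is an assumption made for the sketch.
    """
    structure = {}
    for class_codes, (number_code, price_code) in virtual_records:
        # A NULL calculation variable cannot be evaluated: drop the record.
        if NULL in (number_code, price_code):
            continue
        value = NUMBER[number_code] * PRICE[price_code]
        # A NULL classification code is kept like any other valid value.
        structure[class_codes] = structure.get(class_codes, 0.0) + value
    return structure


# Hypothetical virtual data records (Client code, Year code).
records = [
    ((2, 0), (2, 0)),      # Number 8, Price 7.5: evaluated
    ((2, 0), (3, NULL)),   # Price is NULL: eliminated
    ((3, NULL), (4, 0)),   # Year is NULL: kept as a valid combination
]
result = aggregate(records)
```

The record with a NULL price never reaches the aggregation field,
whereas the record with a NULL year produces its own data record.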

After traversing the starting table, the intermediate
data structure contains four data records, each
including a unique combination of values (0,0; 1,0; 2,0;
3,-2) of the classification variables, and the corresponding
accumulated result (41; 37.5; 60; 75) of the evaluated
mathematical expression.

Preferably, the intermediate data structure is also
processed to eliminate one or more classification
variables (dimensions). Preferably, this is done during
the process of building the intermediate data structure,
as described above. Every time a virtual data record is
evaluated, additional data records are created, or found
if they already exist, in the intermediate data structure.
Each of these additional data records is destined
to hold an aggregation of the evaluated result of the
mathematical expression for all values of one or more
classification variables. Thus, when the starting table
has been traversed, the intermediate data structure will
contain both the aggregated results for all unique
combinations of values of the classification variables,
and the aggregated results after elimination of each
relevant classification variable.

This procedure of eliminating dimensions in the
intermediate data structure will be further described
with reference to Tables 15 and 16. When virtual data
record R1 is evaluated (Table 15) and the first data
record (0,0) is created in the intermediate data structure,
additional data records are created in this structure.
Such additional data records are destined to hold
the corresponding results when one or more dimensions are
eliminated. In Table 16, a classification variable is
assigned a binary code of -1 in the intermediate data
structure to denote that all values of this variable are
evaluated. In this case, three additional data records
are created, each holding a new combination of values
(-1,0; 0,-1; -1,-1) of the classification variables. The
evaluated result is aggregated in the associated
aggregation field of these additional data records. The first
(-1,0) of these additional data records is destined to
hold the aggregated result for all values of the
classification variable "Client" when the classification
variable "Year" has the value "1999". The second (0,-1)
additional data record is destined to hold the aggregated
result for all values of the classification variable
"Year" when the classification variable "Client" is
"Nisse". The third (-1,-1) additional data record is
destined to hold the aggregated result for all values of
both classification variables "Client" and "Year".

When virtual data record R2 is evaluated, the result
is aggregated in the aggregation field associated with
the current combination of values (binary codes: 0,0) of
the classification variables, as well as in the
aggregation fields associated with relevant additional data
records (binary codes: -1,0; 0,-1; -1,-1). When virtual
data record R3 is evaluated, the result is aggregated in
the aggregation field associated with the current
combination of values (binary codes: 1,0) of the
classification variables. The result is also aggregated in the
aggregation field of a newly created additional data
record (binary codes: 1,-1) and in the aggregation fields
associated with relevant existing data records (binary
codes: -1,0; -1,-1) in the intermediate data structure.

After traversing the starting table, the inter- 
mediate data structure contains eleven data records, as 
shown in Table 16. 
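
The creation of the additional data records can be sketched as
follows. The sketch is an illustration under stated assumptions, not
the implementation of the method: the function name
build_intermediate and the use of a Python dictionary keyed on code
tuples are hypothetical, and the per-record results are simply the
terms legible in the aggregation rows of Table 16 (20.5, 20.5, 37.5,
60 and 75). Records contributing 0 are left out.

```python
from itertools import product

ALL = -1  # code denoting "all values" of an eliminated classification variable


def build_intermediate(evaluated_records):
    """Aggregate each result under every combination in which each
    classification code is either kept or replaced by ALL."""
    structure = {}
    for class_codes, result in evaluated_records:
        # For codes (c1, c2) this yields (c1, c2), (c1, ALL),
        # (ALL, c2) and (ALL, ALL).
        for key in product(*[(code, ALL) for code in class_codes]):
            structure[key] = structure.get(key, 0.0) + result
    return structure


# (Client code, Year code) and the evaluated result of "x*y",
# as read from the aggregation rows of Table 16.
evaluated = [
    ((0, 0), 20.5),
    ((0, 0), 20.5),
    ((1, 0), 37.5),
    ((2, 0), 60.0),
    ((3, -2), 75.0),
]
intermediate = build_intermediate(evaluated)
```

With these inputs the dictionary ends up with eleven keys, four for
the unique combinations and seven in which at least one dimension is
eliminated, which agrees with the count stated for Table 16.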

Preferably, if the intermediate data structure
accommodates more than two classification variables, the
intermediate data structure will, for each eliminated
classification variable, contain the evaluated results
aggregated over all values of this classification
variable for each unique combination of values of
remaining classification variables.

When the intermediate data structure has been built,
a final data structure, i.e. a multidimensional cube, as
shown in non-binary notation in Table 17, is created by
evaluating the mathematical function ("SUM(x*y)") based
on the results of the mathematical expression ("x*y")
contained in the intermediate data structure (step 111).
In doing so, the results in the aggregation fields for
each unique combination of values of the classification
variables are combined. In the illustrated case, the
creation of the final data structure is straightforward,
due to the trivial nature of the present mathematical
function. The content of the final data structure might
then (step 112) be presented to the user in a
two-dimensional table, as shown in Table 18. Alternatively,
if the final data structure contains many dimensions, the
data can be presented in a pivot table, in which the user
interactively can move up and down in dimensions, as is
well known in the art.

Below, a second example of the inventive method is
described with reference to Tables 20-29. The description
will only elaborate on certain aspects of this example,
namely building a conversion structure including data
from connecting tables, and building an intermediate data
structure for a more complicated mathematical function.
In this example, the user wants to extract sales data per
client from a database, which contains the data tables
shown in Tables 20-23. For ease of interpretation, the
binary coding is omitted in this example.



The user has specified the following mathematical
functions, for which the result should be partitioned per
Client:

a) "IF (Only(Environment index)='I') THEN
Sum(Number*Price)*2 ELSE Sum(Number*Price)", and
b) "Avg(Number*Price)"

The mathematical function (a) specifies that the
sales figures should be doubled for products that belong
to a product group having an environment index of 'I',
while the actual sales figures should be used for other
products. The mathematical function (b) has been included
for reference.

In this case, the selected classification variables
are "Environment index" and "Client", and the selected
calculation variables are "Number" and "Price". Tables
20, 22 and 23 are identified as boundary tables, whereas
Table 21 is identified as a connecting table. Table 20 is
elected as starting table. Thus, the starting table
contains selected variables ("Number", "Client") and a
connecting variable ("Product"). The connecting variable
links the starting table (Table 20) to the boundary
tables (Tables 22-23), via the connecting table (Table
21).



Next, the formation of the conversion structure will
be described with reference to Tables 24-26. A first part
(Table 24) of the conversion structure is built by
successively reading data records of a first boundary
table (Table 23) and creating a link between each unique
value of the connecting variable ("Product group") and a
corresponding value of the selected variable ("Environment
index"). Similarly, a second part (Table 25) of the
conversion structure is built by successively reading
data records of a second boundary table (Table 22) and
creating a link between each unique value of the
connecting variable ("Price group") and a corresponding value
of the selected variable ("Price"). Then, data records of
the connecting table (Table 21) are read successively.
Each value of the connecting variables ("Product group"
and "Price group", respectively) in Tables 24 and 25 is
substituted for a corresponding value of a connecting
variable ("Product") in Table 21. The result is merged in
one final conversion structure, as shown in Table 26.
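
The formation of the conversion structure can be sketched as
follows. This is an illustration under stated assumptions, not the
implementation of the method: representing each table as a list of
Python dictionaries and the helper name link are hypothetical
choices. The data values are those of Tables 21-23, with the
environment index of product group Y written as printed.

```python
# Connecting table (Table 21) and boundary tables (Tables 22 and 23).
table_21 = [
    {"Product": "A", "Price group": 4, "Product group": "Z"},
    {"Product": "B", "Price group": 3, "Product group": "X"},
]
table_22 = [
    {"Price group": 3, "Price": 5.5},
    {"Price group": 4, "Price": 3.5},
]
table_23 = [
    {"Product group": "X", "Environment index": "I"},
    {"Product group": "Y", "Environment index": "IX"},
    {"Product group": "Z", "Environment index": "II"},
]


def link(table, connecting, selected):
    """Link each value of a connecting variable to a selected value."""
    return {record[connecting]: record[selected] for record in table}


# First and second part of the conversion structure (Tables 24 and 25).
env_by_product_group = link(table_23, "Product group", "Environment index")
price_by_price_group = link(table_22, "Price group", "Price")

# Substitute the connecting variable "Product" of Table 21 for the group
# variables and merge the two parts (Table 26).
conversion = {
    record["Product"]: {
        "Price": price_by_price_group[record["Price group"]],
        "Environment index": env_by_product_group[record["Product group"]],
    }
    for record in table_21
}
```

After the substitution, the conversion structure is keyed directly
on "Product", so the connecting table and the boundary tables need
not be consulted again while the starting table is traversed.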
Then, an intermediate data structure is built by
successively reading data records of the starting table
(Table 20), by using the conversion structure (Table 26)
to incorporate the current values of the selected
variables ("Environment index", "Client", "Number", "Price")
in the virtual data record, and by evaluating each
mathematical expression based on the current content of the
virtual data record.

For reasons of clarity, Table 27 displays the
corresponding content of the virtual data record for each
data record of the starting table. As noted in connection
with the first example, only one virtual data record is
needed. The content of this virtual data record is
updated, i.e. replaced, for each data record of the
starting table.
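
The reuse of a single virtual data record can be sketched as
follows. This is an illustration under stated assumptions: the
generator name virtual_records and the dictionary representation
are hypothetical, and the conversion structure is written out by
hand from Table 26.

```python
# Starting table (Table 20), reduced to the variables used here.
starting_table = [
    {"Product": "B", "Number": 5, "Client": "Nisse"},
    {"Product": "A", "Number": 7, "Client": "Kalle"},
    {"Product": "B", "Number": 9, "Client": "Kalle"},
]

# Conversion structure (Table 26): Product -> Price, Environment index.
conversion = {
    "A": {"Price": 3.5, "Environment index": "II"},
    "B": {"Price": 5.5, "Environment index": "I"},
}


def virtual_records(table, conversion_structure):
    """Yield the one virtual data record, updated in place per data record."""
    virtual = {}
    for record in table:
        virtual.clear()  # the previous content is replaced, not kept
        virtual["Client"] = record["Client"]
        virtual["Number"] = record["Number"]
        virtual.update(conversion_structure[record["Product"]])
        yield virtual


# Copy each state only to display it, as Table 27 does.
snapshots = [dict(v) for v in virtual_records(starting_table, conversion)]
```

Because the same dictionary object is yielded every time, memory
use does not grow with the number of data records in the starting
table. The copies in snapshots exist only for display.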

Each data record of the intermediate data structure,
as shown in Table 28, accommodates a value of each
selected classification variable ("Client", "Environment
index") and an aggregation field for each mathematical
expression implied by the mathematical functions. In this
case, the intermediate data structure contains two
aggregation fields. One aggregation field contains the
aggregated result of the mathematical expression ("x*y")
operating on the selected calculation variables
("Number", "Price"), as well as a counter of the number
of such operations. The layout of this aggregation field
is given by the fact that an average quantity should be
calculated ("Avg(x*y)"). The other aggregation field is
designed to hold the lowest and highest values of the
classification variable "Environment index" for each
combination of values of the classification variables.

As in the first example, the intermediate data
structure (Table 28) is built by evaluating the
mathematical expression for the current content of the
virtual data record (each row in Table 27), and by
aggregating the result in the appropriate aggregation
field based on the combination of current values of the
classification variables ("Client", "Environment index").
The intermediate data structure also includes data
records in which the value "<ALL>" has been assigned to
one or both of the classification variables. The
corresponding aggregation fields contain the aggregated result
when the one or more classification variables
(dimensions) are eliminated.
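
The two aggregation fields can be sketched as follows. This is an
illustration under stated assumptions, not the implementation of
the method: the function name build_intermediate and the field
names sum, n, first and last are hypothetical, and plain string
comparison is assumed to be an adequate ordering for the values
'I' and 'II'. The input rows are those of Table 27.

```python
from itertools import product

ALL = "<ALL>"

# Content of the virtual data record per data record (Table 27):
# Client, Environment index, Number, Price.
virtual_rows = [
    ("Nisse", "I", 5, 5.5),
    ("Kalle", "II", 7, 3.5),
    ("Kalle", "I", 9, 5.5),
]


def build_intermediate(rows):
    """Aggregate x*y with a counter, and track lowest/highest index."""
    structure = {}
    for client, env_index, number, price in rows:
        value = number * price
        # Keep each classification value or replace it by <ALL>.
        for key in product((client, ALL), (env_index, ALL)):
            field = structure.setdefault(
                key, {"sum": 0.0, "n": 0, "first": env_index, "last": env_index}
            )
            field["sum"] += value  # aggregated result of "x*y"
            field["n"] += 1        # number of such operations
            field["first"] = min(field["first"], env_index)
            field["last"] = max(field["last"], env_index)
    return structure


table_28 = build_intermediate(virtual_rows)
```

The sketch eliminates every classification variable in turn, so it
also creates records such as ("Kalle", "<ALL>") in addition to the
six records listed in Table 28.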

When the intermediate data structure has been built,
a final data structure, i.e. a multidimensional cube, is
created by evaluating the mathematical functions based on
the evaluated results of the mathematical expressions
contained in the intermediate data structure. Each data
record of the final data structure, as shown in Table 29,
accommodates a value of each selected classification
variable ("Client", "Environment index") and an
aggregation field for each mathematical function selected by
the user.

The final data structure is built based on the
results in the aggregation fields of the intermediate
data structure for each unique combination of values of
the classification variables. When function (a) is
evaluated, by sequentially reading data records of Table
28, the program first checks if both values in the last
column of Table 28 are equal to 'I'. If so, the relevant
result contained in the first aggregation field of Table
28 is multiplied by two and stored in Table 29. If not,
the relevant result contained in the first aggregation
field of Table 28 is directly stored in Table 29. When
function (b) is evaluated, the aggregated result of the
mathematical expression ("x*y") operating on the selected
calculation variables ("Number", "Price") is divided by
the number of such operations, both of which are stored
in the first aggregation field of Table 28. The result is
stored in the second aggregation field of Table 29.
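
The evaluation of functions (a) and (b) from the aggregation fields
can be sketched as follows. This is an illustration under stated
assumptions: the function name evaluate and the tuple layout are
hypothetical, and the treatment of a record whose lowest and
highest environment index differ (no result for function (a)) is
inferred from the <NULL> entry of Table 29 rather than from the
description. The input values are those of Table 28.

```python
# Intermediate data structure (Table 28):
# (Client, Environment index) -> (sum of x*y, count, first, last).
table_28 = {
    ("Nisse", "I"): (27.5, 1, "I", "I"),
    ("Kalle", "II"): (24.5, 1, "II", "II"),
    ("Kalle", "I"): (49.5, 1, "I", "I"),
    ("<ALL>", "I"): (77.0, 2, "I", "I"),
    ("<ALL>", "II"): (24.5, 1, "II", "II"),
    ("<ALL>", "<ALL>"): (101.5, 3, "I", "II"),
}


def evaluate(sum_xy, count, first, last):
    """Return the results of functions (a) and (b) for one data record."""
    if first != last:
        # No single environment index: assumed to give no result,
        # matching the <NULL> entry of Table 29.
        func_a = None
    elif first == "I":
        func_a = sum_xy * 2   # sales figures doubled for index 'I'
    else:
        func_a = sum_xy       # actual sales figures otherwise
    func_b = sum_xy / count   # Avg(x*y) = sum divided by number of operations
    return func_a, func_b


table_29 = {key: evaluate(*fields) for key, fields in table_28.items()}
```

Storing the sum together with its counter in the intermediate data
structure is what allows the average to be computed here without
returning to the starting table.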

Evidently, the present invention allows the user to
freely select mathematical functions and incorporate
calculation variables in these functions as well as to
freely select classification variables for presentation
of the results.

As an alternative, albeit less memory-efficient, to
the illustrated procedure of building an intermediate
data structure based on sequential data records from the
starting table, it is conceivable to first build a
so-called join table. This join table is built by traversing
all data records of the starting table and, by use of the
conversion structure, converting each value of each
connecting variable in the starting table into a value of
at least one corresponding selected variable in a
boundary table. Thus, the data records of the join table
will contain all occurring combinations of values of the
selected variables. Then, the intermediate data structure
is built based on the content of the join table. For each
record of the join table, each mathematical expression is
evaluated and the result is aggregated in the appropriate
aggregation field, based on the current value of each
selected classification variable. However, this
alternative procedure requires more computer memory to extract
the requested information.
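
The join-table alternative can be sketched as follows. This is an
illustration under stated assumptions: the list-of-dictionaries
representation and the variable names are hypothetical, and the
data are those of the second example (Tables 20 and 26).

```python
# Starting table (Table 20), reduced to the variables used here.
starting_table = [
    {"Product": "B", "Number": 5, "Client": "Nisse"},
    {"Product": "A", "Number": 7, "Client": "Kalle"},
    {"Product": "B", "Number": 9, "Client": "Kalle"},
]

# Conversion structure (Table 26): Product -> Price, Environment index.
conversion = {
    "A": {"Price": 3.5, "Environment index": "II"},
    "B": {"Price": 5.5, "Environment index": "I"},
}

# Step 1: materialise the whole join table before any evaluation.
join_table = [
    {
        "Client": record["Client"],
        "Number": record["Number"],
        **conversion[record["Product"]],
    }
    for record in starting_table
]

# Step 2: evaluate "x*y" per join record and aggregate per combination
# of classification values.
intermediate = {}
for record in join_table:
    key = (record["Client"], record["Environment index"])
    value = record["Number"] * record["Price"]
    intermediate[key] = intermediate.get(key, 0.0) + value
```

The join table holds one converted record per data record of the
starting table at the same time, which is the source of the extra
memory requirement compared with a single virtual data record.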

It should be realised that the mathematical function
could contain mathematical expressions having different,
and conflicting, needs for frequency data. In this case,
steps 104-110 (Fig. 2) are repeated for each such
mathematical expression, and the results are stored in
one common intermediate data structure. Alternatively,
one final data structure, i.e. multidimensional cube,
could be built for each mathematical expression, the
contents of these cubes being fused during presentation
to the user.

Appendix A



Tables 6 and 7

Product       Binary code
Soap          0
Soft soap     1
Toothpaste    2
Shampoo       3

Price         Binary code
7.5           0
9.35          1
6.5           2

Tables 8 and 9

Date          Binary code
1999-01-02    0
1999-01-07    1
1999-01-08    2
1999-01-11    3
1999-01-12    4
1999-01-15    5

Client        Binary code
Nisse         0
Gullan        1
Kalle         2
Pekka         3
Jens          4



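Table 11

Year          Binary code
1999          0

Table 10

Number        Binary code
3             0
5             1
8             2
2             3
10            4

Table 12

Country       Binary code
Sweden        0
Denmark       1
Finland       2

Table 13

[Conversion structure linking binary codes of "Date" to binary
codes of "Year"; the diagram is not recoverable from the source.]

Table 14

Product -> Price

[Conversion structure linking binary codes of "Product" to binary
codes of "Price"; the diagram is not recoverable from the source.]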



Table 15 and Table 16

[Table 15 lists the virtual data records in binary-coded form
(columns: Client, Year, Price, Number). Table 16 lists the
intermediate data structure (columns: Client, Year, aggregation
field Sum(Number*Price)). Most cell values are not recoverable from
the source; the legible aggregation rows of Table 16 follow.]

Client  Year  Aggregation field
-1      0     Sum=Sum(0)+20.5+20.5+37.5+60+0+0->138.5
-1      -2    Sum=Sum(0)+75->75
0       -1    Sum=Sum(0)+20.5+20.5+0->41
1       -1    Sum=Sum(0)+37.5->37.5
2       -1    Sum=Sum(0)+60+0->60
3       -1    Sum=Sum(0)+75->75
-1      -1    Sum=Sum(0)+20.5+20.5+37.5+60+0+75+0->213.5

Table 17



Client    Year      Sum(Number*Price)
Nisse     1999      41
Gullan    1999      37.5
Kalle     1999      60
Pekka     <NULL>    75
<ALL>     1999      138.5
<ALL>     <NULL>    75
Nisse     <ALL>     41
Gullan    <ALL>     37.5
Kalle     <ALL>     60
Pekka     <ALL>     75
<ALL>     <ALL>     213.5



Table 18

Sum(Price*Number) per Client, Year

          1999      <NULL>    <ALL>
Nisse     41                  41
Gullan    37.5                37.5
Kalle     60                  60
Pekka               75        75
<ALL>     138.5     75        213.5



Table 20

Date          Product   Number   Client
1998-12-20    B         5        Nisse
1999-02-05    A         7        Kalle
1999-02-06    B         9        Kalle



Table 21

Product   Price group   Product group
A         4             Z
B         3             X



Table 22

Price group   Price
3             5.5
4             3.5

Table 23

Product group   Environment index   Legal status
X               I                   YES
Y               IX                  NO
Z               II                  YES

Table 24

Product group -> Environment index
X -> I
Y -> IX
Z -> II

Table 25

Price group -> Price
3 -> 5.5
4 -> 3.5

Table 26

Product -> Price, Environment index
A -> 3.5, II
B -> 5.5, I



Table 27

Client   Environment index   Number   Price
Nisse    I                   5        5.5
Kalle    II                  7        3.5
Kalle    I                   9        5.5



Table 28

Client   Environment index   Σ Number*Price      Σ Environment index
Nisse    I                   Σx: 27.5, N: 1      First: I, Last: I
Kalle    II                  Σx: 24.5, N: 1      First: II, Last: II
Kalle    I                   Σx: 49.5, N: 1      First: I, Last: I
<ALL>    I                   Σx: 77, N: 2        First: I, Last: I
<ALL>    II                  Σx: 24.5, N: 1      First: II, Last: II
<ALL>    <ALL>               Σx: 101.5, N: 3     First: I, Last: II



Table 29

(a) IF(Only(Environment index)='I',
    Sum(Number*Price)*2, Sum(Number*Price))
(b) Avg(Number*Price)

Client   Environment index   (a)       (b)
Nisse    I                   55.0      27.5
Kalle    II                  24.5      24.5
Kalle    I                   99.0      49.5
<ALL>    I                   154.0     38.5
<ALL>    II                  24.5      24.5
<ALL>    <ALL>               <NULL>    33.8

CLAIMS

1. A method for extracting information from a
database, which comprises a number of data tables
containing values of a number of variables, each data
table consisting of at least one data record including at
least two of said values, said information being
extracted by evaluation of at least one mathematical
function operating on one or more selected calculation
variables, said extracted information being partitioned
on one or more selected classification variables,
characterised by the steps of:

identifying all data tables containing at least one
value of one of said selected variables, such data tables
being boundary tables;

identifying all data tables that, directly or
indirectly, have variables in common with said boundary
tables and connect the same, such data tables being
connecting tables;

electing a starting table among said boundary and
connecting tables;

building a conversion structure that links values of
each selected variable in said boundary tables to
corresponding values of one or more connecting variables
in said starting table; and

evaluating said mathematical function for each data
record of said starting table, by using said conversion
structure to convert each value of each connecting
variable into at least one value of at least one
corresponding selected variable, such that said
evaluation yields a final data structure containing a
result of said mathematical function for every unique
value of each classification variable.

2. A method as set forth in claim 1, characterised
by the further step of presenting relevant
parts of said resulting data structure to the user in
human-readable form.

3. A method as set forth in claim 1 or 2,
characterised by the further step of initially
reading said data records of said database into the
primary memory of a computer.

4. A method as set forth in any one of the preceding
claims, characterised by the further step of
initially assigning a different binary code to each
unique value of each data variable in said database and
storing the data records in binary-coded form.

5. A method as set forth in any one of the preceding
claims, characterised by the further steps of
initially identifying all data tables in said database
that have variables in common, and assigning virtual
connections between such data tables, thereby creating a
database with a snowflake structure, wherein said
connecting tables are located between said boundary
tables in said snowflake structure.

6. A method as set forth in any one of the preceding
claims, characterised by the further steps of
identifying all calculation variables for which the
number of occurrences of each value is necessary for
correct evaluation of said mathematical function,
defining a subset of data tables consisting of boundary
tables containing such variables and data tables
connecting such boundary tables, electing said starting
table from said subset, and including data on said number
of occurrences of each value in said conversion
structure.

7. A method as set forth in any one of the preceding
claims, characterised in that said starting
table is the data table among said boundary and
connecting tables having the largest number of data
records.

8. A method as set forth in any one of the preceding
claims, characterised by the further step of
building said final data structure, which includes a
number of data records, each of which contains a field
for each selected classification variable and an
aggregation field for said mathematical function, wherein
said building step includes sequentially reading a data
record of said starting table, creating a current
combination of values of said selected variables by using
said conversion structure to convert each value of each
connecting variable in said data record into a value of
at least one corresponding selected variable, evaluating
said mathematical function for said current combination
of values, and aggregating the result of said evaluation
in the appropriate aggregation field based on the current
value of each selected classification variable.

9. A method as set forth in any one of claims 1-7,
characterised by the further step of creating a
virtual data record containing a combination of values of
said selected variables, wherein said creating step
includes reading a data record of said starting table and
using said conversion structure to convert each value of
each connecting variable in said data record into a value
of at least one corresponding selected variable.

10. A method as set forth in claim 9, characterised
by the further step of building said final
data structure which includes a number of data records,
each of which contains a field for each selected
classification variable and an aggregation field for said
mathematical function, wherein said building step
includes sequentially reading a data record of said
starting table, updating the content of said virtual data
record based on the content of each such data record,
evaluating said mathematical function based on said
updated virtual data record, and aggregating the result
of said evaluation in the appropriate aggregation field
based on the current value of each selected
classification variable in said updated virtual data
record.

11. A method as set forth in claim 9, characterised
by the further step of building an
intermediate data structure which includes a number of
data records, each of which contains a field for each
selected classification variable and an aggregation field
for each mathematical expression implied by said
mathematical function, wherein said building step
includes sequentially reading a data record of said
starting table, updating the content of said virtual data
record based on the content of each such data record,
evaluating each mathematical expression based on said
updated virtual data record, and aggregating the result
of said evaluation in an appropriate aggregation field
based on the current value of each selected
classification variable in said updated virtual data
record.

12. A method as set forth in claim 11, characterised
in that said step of building said
intermediate data structure includes:

eliminating one of said classification variables in
said intermediate data structure by aggregating said
results over all values of said one classification
variable for each unique combination of values of
remaining classification variables, by creating
additional data records, and by incorporating said
aggregated results in said additional data records of
said intermediate data structure.

13. A method as set forth in claim 11 or 12,
characterised by the further step of evaluating said
mathematical function based on said results in said
aggregation fields for each unique combination of values
of said classification variables, thereby building said
final data structure.

14. A method as set forth in any one of the
preceding claims, characterised in that said
step of building said conversion structure includes:

a) reading data records of a boundary table, and
creating a conversion structure including a link between
each unique value of at least one connecting variable in
said boundary table and each corresponding value of at
least one selected variable therein;

b) moving from said boundary table towards said
starting table;

c) if a connecting table is found, reading data
records of said connecting table, and substituting each
unique value of said at least one connecting variable in
said conversion structure for at least one corresponding
unique value of at least one connecting variable in said
connecting table; and

d) repeating steps (b)-(c) until said starting table
is found.

15. An article of manufacture comprising a
computer-readable medium having stored thereon a computer program
for effecting the steps of a method for extracting
information from a database as set forth in any one of
the preceding claims.

Sum(Price*Number) per Client, Year

Fig. 1

Table 1

Product       Price   Part
Soap          7.5     -
Soft soap     9.35    -
Toothpaste    6.5     Tube
Toothpaste    6.5     Cap

Table 2

Date          Client   Product      Number
1999-01-02    Nisse    Toothpaste   3
1999-01-07    Gullan   Soap         5
1999-01-08    Kalle    Soap         8
1999-01-11    Kalle    Shampoo      2
1999-01-12    Pekka    Soap         10

Table 3

Date          Year   Month     Day
1999-01-02    1999   January   2
1999-01-07    1999   January   7
1999-01-08    1999   January   8
1999-01-11    1999   January   11
1999-01-15    1999   January   15

Table 4

Client   Country
Nisse    Sweden
Kalle    Sweden
Jens     Denmark
Pekka    Finland

Table 5

Country   Capital      Population
Sweden    Stockholm    9000000
Denmark   Copenhagen   5000000

Fig. 2



[Flow chart. The reference numerals of the first four boxes cannot
be assigned reliably from the source; the numerals 102 and 103 are
legible in this group.]

     Read data records from database into primary memory
     Create internal table structure and store binary-coded
     data values in table structure
     Identify connections between data tables of table structure
     Input mathematical function, calculation variables, and
     classification variables
104  Identify relevant data tables
105  Select starting table
106  Build conversion structure
107  Read data record from starting table
108  Update virtual record with current values of selected
     variables, by use of conversion structure
109  Build intermediate data structure based on content of
     virtual data record
     [Decision box with a "NO" branch; its label is not
     recoverable from the source.]
111  Build final data structure from intermediate data structure
112  Present result to user